Stan (software)
   HOME

TheInfoList



OR:

Stan is a
probabilistic programming language Probabilistic programming (PP) is a programming paradigm in which probabilistic models are specified and inference for these models is performed automatically. It represents an attempt to unify probabilistic modeling and traditional general pur ...
for
statistical inference Statistical inference is the process of using data analysis to infer properties of an underlying probability distribution, distribution of probability.Upton, G., Cook, I. (2008) ''Oxford Dictionary of Statistics'', OUP. . Inferential statistical ...
written in
C++ C++ (pronounced "C plus plus") is a high-level general-purpose programming language created by Danish computer scientist Bjarne Stroustrup as an extension of the C programming language, or "C with Classes". The language has expanded significan ...
.Stan Development Team. 2015
Stan Modeling Language User's Guide and Reference Manual, Version 2.9.0
/ref> The Stan language is used to specify a (Bayesian)
statistical model A statistical model is a mathematical model that embodies a set of statistical assumptions concerning the generation of Sample (statistics), sample data (and similar data from a larger Statistical population, population). A statistical model repres ...
with an imperative program calculating the log
probability density function In probability theory, a probability density function (PDF), or density of a continuous random variable, is a function whose value at any given sample (or point) in the sample space (the set of possible values taken by the random variable) can ...
. Stan is licensed under the
New BSD License BSD licenses are a family of permissive free software licenses, imposing minimal restrictions on the use and distribution of covered software. This is in contrast to copyleft licenses, which have share-alike requirements. The original BSD li ...
. Stan is named in honour of
Stanislaw Ulam Stanisław Marcin Ulam (; 13 April 1909 – 13 May 1984) was a Polish-American scientist in the fields of mathematics and nuclear physics. He participated in the Manhattan Project, originated the Teller–Ulam design of thermonuclear weapon ...
, pioneer of the
Monte Carlo method Monte Carlo methods, or Monte Carlo experiments, are a broad class of computational algorithms that rely on repeated random sampling to obtain numerical results. The underlying concept is to use randomness to solve problems that might be determi ...
. Stan was created by a development team consisting of 34 members that includes
Andrew Gelman Andrew Eric Gelman (born February 11, 1965) is an American statistician and professor of statistics and political science at Columbia University. Gelman received bachelor of science degrees in mathematics and in physics from MIT, where he was ...
, Bob Carpenter, Matt Hoffman, and Daniel Lee.


Interfaces

The Stan language itself can be accessed through several interfaces: * CmdStan – a command-line executable for the
shell Shell may refer to: Architecture and design * Shell (structure), a thin structure ** Concrete shell, a thin shell of concrete, usually with no interior columns or exterior buttresses ** Thin-shell structure Science Biology * Seashell, a hard o ...
, * CmdStanR and rstan – R software libraries, * CmdStanPy and PyStan – libraries for the
Python programming language Python is a high-level, general-purpose programming language. Its design philosophy emphasizes code readability with the use of significant indentation. Python is dynamically-typed and garbage-collected. It supports multiple programming p ...
, * MatlabStan – integration with the
MATLAB MATLAB (an abbreviation of "MATrix LABoratory") is a proprietary multi-paradigm programming language and numeric computing environment developed by MathWorks. MATLAB allows matrix manipulations, plotting of functions and data, implementation ...
numerical computing environment, * Stan.jl – integration with the
Julia programming language Julia is a high-level, dynamic programming language. Its features are well suited for numerical analysis and computational science. Distinctive aspects of Julia's design include a type system with parametric polymorphism in a dynamic programm ...
, * StataStan – integration with Stata. In addition, higher-level interfaces are provided with packages using Stan as backend, primarily in the
R language R is a programming language for statistical computing and graphics supported by the R Core Team and the R Foundation for Statistical Computing. Created by statisticians Ross Ihaka and Robert Gentleman, R is used among data miners, bioinfor ...
: * ''rstanarm'' provides a drop-in replacement for frequentist models provided by base R and ''lme4'' using the R formula syntax; * ''brms'' provides a wide array of linear and nonlinear models using the R formula syntax; * ''prophet'' provides automated procedures for
time series In mathematics, a time series is a series of data points indexed (or listed or graphed) in time order. Most commonly, a time series is a sequence taken at successive equally spaced points in time. Thus it is a sequence of discrete-time data. Exa ...
forecasting Forecasting is the process of making predictions based on past and present data. Later these can be compared (resolved) against what happens. For example, a company might estimate their revenue in the next year, then compare it against the actual ...
.


Algorithms

Stan implements gradient-based
Markov chain Monte Carlo In statistics, Markov chain Monte Carlo (MCMC) methods comprise a class of algorithms for sampling from a probability distribution. By constructing a Markov chain that has the desired distribution as its equilibrium distribution, one can obtain ...
(MCMC) algorithms for Bayesian inference, stochastic, gradient-based
variational Bayesian methods Variational Bayesian methods are a family of techniques for approximating intractable integrals arising in Bayesian inference and machine learning. They are typically used in complex statistical models consisting of observed variables (usually ...
for approximate Bayesian inference, and gradient-based
optimization Mathematical optimization (alternatively spelled ''optimisation'') or mathematical programming is the selection of a best element, with regard to some criterion, from some set of available alternatives. It is generally divided into two subfi ...
for penalized maximum likelihood estimation. * MCMC algorithms: **
Hamiltonian Monte Carlo The Hamiltonian Monte Carlo algorithm (originally known as hybrid Monte Carlo) is a Markov chain Monte Carlo method for obtaining a sequence of random samples which converge to being distributed according to a target probability distribution for whi ...
(HMC) ** No-U-Turn sampler (NUTS), a variant of HMC and Stan's default MCMC engine * Variational inference algorithms: ** Automatic Differentiation Variational Inference * Optimization algorithms: **
Limited-memory BFGS Limited-memory BFGS (L-BFGS or LM-BFGS) is an optimization algorithm in the family of quasi-Newton methods that approximates the Broyden–Fletcher–Goldfarb–Shanno algorithm (BFGS) using a limited amount of computer memory. It is a popular alg ...
(Stan's default optimization algorithm) **
Broyden–Fletcher–Goldfarb–Shanno algorithm In numerical optimization, the Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm is an iterative method for solving unconstrained nonlinear optimization problems. Like the related Davidon–Fletcher–Powell method, BFGS determines the ...
**
Laplace's method In mathematics, Laplace's method, named after Pierre-Simon Laplace, is a technique used to approximate integrals of the form :\int_a^b e^ \, dx, where f(x) is a twice-differentiable function, ''M'' is a large number, and the endpoints ''a'' an ...
for classical standard error estimates and approximate Bayesian posteriors


Automatic differentiation

Stan implements reverse-mode
automatic differentiation In mathematics and computer algebra, automatic differentiation (AD), also called algorithmic differentiation, computational differentiation, auto-differentiation, or simply autodiff, is a set of techniques to evaluate the derivative of a function s ...
to calculate gradients of the model, which is required by HMC, NUTS, L-BFGS, BFGS, and variational inference. The automatic differentiation within Stan can be used outside of the probabilistic programming language.


Usage

Stan is used in fields including social science, pharmaceutical statistics,
market research Market research is an organized effort to gather information about target markets and customers: know about them, starting with who they are. It is an important component of business strategy and a major factor in maintaining competitiveness. Mark ...
, and
medical imaging Medical imaging is the technique and process of imaging the interior of a body for clinical analysis and medical intervention, as well as visual representation of the function of some organs or tissues (physiology). Medical imaging seeks to rev ...
.


References


Further reading

* * Gelman, Andrew, Daniel Lee, and Jiqiang Guo (2015).
Stan: A probabilistic programming language for Bayesian inference and optimization
Journal of Educational and Behavioral Statistics. * Hoffman, Matthew D., Bob Carpenter, and Andrew Gelman (2012)
Stan, scalable software for Bayesian modeling
, Proceedings of the NIPS Workshop on Probabilistic Programming.


External links


Stan web site

Stan source
a
Git Git () is a distributed version control system: tracking changes in any set of files, usually used for coordinating work among programmers collaboratively developing source code during software development. Its goals include speed, data in ...
repository hosted on
GitHub GitHub, Inc. () is an Internet hosting service for software development and version control using Git. It provides the distributed version control of Git plus access control, bug tracking, software feature requests, task management, continuous ...
{{Statistical software Computational statistics Free Bayesian statistics software Monte Carlo software Numerical programming languages Domain-specific programming languages Probabilistic software